Useful Estimates of Assay Performance from Small Data Sets
Similar resources
Entropy estimates of small data sets
Abstract. Estimating entropies from limited data series is known to be a non-trivial task. Naïve estimations are plagued with both systematic (bias) and statistical errors. Here, we present a new “balanced estimator” for entropy functionals (Shannon, Rényi and Tsallis) specially devised to provide a compromise between low bias and small statistical errors, for short data series. This new estima...
Performance of Data Structures for Small Sets of Strings
Fundamental structures such as trees and hash tables are used for managing data in a huge variety of circumstances. Making the right choice of structure is essential to efficiency. In previous work we have explored the performance of a range of data structures—different forms of trees, tries, and hash tables—for the task of managing sets of millions of strings, and have developed new variants o...
Correcting MM estimates for "fat" data sets
Regression MM estimates require the estimation of the error scale, and the determination of a constant that controls the efficiency. These two steps are based on asymptotic results that are derived assuming that the number of predictors p remains fixed while the number of observations n tends to infinity, which means assuming that the ratio p/n is small. However, many high-dimensional data sets ...
Part-of-Speech Tagging from "Small" Data Sets
Probabilistic approaches to part-of-speech (POS) tagging compile statistics from massive corpora such as the Lancaster-Oslo-Bergen (LOB) corpus. Training on a 900,000 token training corpus, the hidden Markov model (HMM) method easily achieves a 95 per cent success rate on a 100,000 token test corpus. However, even such large corpora contain relatively few words and new words are subsequently en...
Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...
Journal
Journal title: Clinical Chemistry
Year: 2004
ISSN: 0009-9147,1530-8561
DOI: 10.1373/clinchem.2004.036996